Smart Paper Notes

Continuous Methods : Adaptively intrusive reduced order model closure

Emmanuel Menier , Michele Alessandro Bucci , Mouadh Yagoubi , Lionel Mathelin , Thibault Dairay , Raphael Meunier , Marc Schoenauer

Category: Machine Learning

2022-11-30

Reduced order modeling methods are often used as a means to reduce simulation costs in industrial applications. Despite their computational advantages, reduced order models (ROMs) often fail to accurately reproduce the complex dynamics encountered in real-life applications. To address this challenge, we leverage NeuralODEs to propose a novel ROM correction approach based on a time-continuous memory formulation. Finally, experimental results show that our proposed method provides a high level of accuracy while retaining the low computational costs inherent to reduced models.

Continuous Methods : Hamiltonian Domain Translation

Emmanuel Menier , Michele Alessandro Bucci , Mouadh Yagoubi , Lionel Mathelin , Marc Schoenauer

Category: Computer Vision

2022-07-08

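This paper proposes a novel approach to domain translation. Leveraging established parallels between generative models and dynamical systems, we propose a reformulation of the Cycle-GAN architecture. By embedding our model with a Hamiltonian structure, we obtain a continuous, expressive and, most importantly, invertible generative model for domain translation.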

Leveraging the structure of dynamical systems for data-driven modeling

Alessandro Bucci , Onofrio Semeraro , Alexandre Allauzen , Sergio Chibbaro , Lionel Mathelin

Category: Machine Learning

2021-12-15

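Reliable prediction of the temporal behavior of complex systems is required in numerous scientific fields. This strong interest is, however, hindered by modeling issues: often, the governing equations describing the physics of the system under consideration are not accessible or, when known, their solution might require a computational time incompatible with the prediction time constraints. Nowadays, approximating a complex system in a generic functional format and informing it ex nihilo from available observations has become common practice, as illustrated by the enormous amount of scientific work that has appeared in the last few years. Numerous successful examples based on deep neural networks are already available, although the generalizability of the models and the margins of guarantee are often overlooked. Here, we consider Long Short-Term Memory neural networks and thoroughly investigate the impact of the training set and its structure on the quality of the long-term prediction. Leveraging ergodic theory, we analyze the amount of data sufficient to guarantee, a priori, a faithful model of the physical system. We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models, opening up avenues for research in the context of active learning. Further, the non-trivial effects of memory initialization when relying on memory-capable models will be illustrated. Our findings provide evidence-based good practice on the amount and choice of data required for effective data-driven modeling of any complex dynamical system.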

Semi-supervised GAN for Bladder Tissue Classification in Multi-Domain Endoscopic Images

Jorge F. Lazo , Benoit Rosa , Michele Catellani , Matteo Fontana , Francesco A. Mistretta , Gennaro Musi , Ottavio de Cobelli , Michel de Mathelin , Elena De Momi

Category: Computer Vision | Machine Learning

2022-12-21

Objective: Accurate visual classification of bladder tissue during Trans-Urethral Resection of Bladder Tumor (TURBT) procedures is essential to improve early cancer diagnosis and treatment. During TURBT interventions, White Light Imaging (WLI) and Narrow Band Imaging (NBI) techniques are used for lesion detection. Each imaging technique provides diverse visual information that allows clinicians to identify and classify cancerous lesions. Computer vision methods that use both imaging techniques could improve endoscopic diagnosis. We address the challenge of tissue classification when annotations are available only in one domain, in our case WLI, and the endoscopic images correspond to an unpaired dataset, i.e. there is no exact equivalent for every image in both NBI and WLI domains. Method: We propose a semi-supervised Generative Adversarial Network (GAN)-based method composed of three main components: a teacher network trained on the labeled WLI data; a cycle-consistency GAN to perform unpaired image-to-image translation; and a multi-input student network. To ensure the quality of the synthetic images generated by the proposed GAN, we perform a detailed quantitative and qualitative analysis with the help of specialists. Conclusion: The overall average classification accuracy, precision, and recall obtained with the proposed method for tissue classification are 0.90, 0.88, and 0.89 respectively, while the same metrics obtained in the unlabeled domain (NBI) are 0.92, 0.64, and 0.94 respectively. The quality of the generated images is reliable enough to deceive specialists. Significance: This study shows the potential of using semi-supervised GAN-based classification to improve bladder tissue classification when annotations are limited in multi-domain data.
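
The accuracy, precision, and recall figures quoted above can be hard to compare at a glance, so the following minimal sketch shows how the three metrics are computed and why high accuracy and recall can coexist with lower precision. It assumes a simplified binary setting with hypothetical labels; the function name `classification_metrics` and the data are illustrative only and are not taken from the paper, which may average the metrics over several tissue classes.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels (not data from the study): 2 positives, 8 negatives,
# and one negative sample wrongly predicted as positive.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

In this toy case a single false positive leaves accuracy and recall high while pulling precision down, the same qualitative pattern as the NBI figures reported in the abstract.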

An Empirical Study of Library Usage and Dependency in Deep Learning Frameworks

Mohamed Raed El aoun , Lionel Nganyewou Tidjon , Ben Rombaut , Foutse Khomh , Ahmed E. Hassan

Category: Artificial Intelligence

2022-11-28

Recent advances in deep learning (DL) have led to the release of several DL software libraries, such as PyTorch, Caffe, and TensorFlow, to assist machine learning (ML) practitioners in developing and deploying state-of-the-art deep neural networks (DNNs). However, practitioners are not able to properly cope with limitations in the DL libraries, such as testing or data processing. In this paper, we present a qualitative and quantitative analysis of the most frequent DL library combinations and of the distribution of DL library dependencies across the ML workflow, and we formulate a set of recommendations to (i) hardware builders for more optimized accelerators and (ii) library builders for more refined future releases. Our study is based on 1,484 open-source DL projects with 46,110 contributors, selected based on their reputation. First, we found an increasing trend in the usage of deep learning libraries. Second, we highlight several usage patterns of deep learning libraries. In addition, we identify dependencies between DL libraries and discover that PyTorch with Scikit-learn, and Keras with TensorFlow, are the most frequent combinations, appearing in 18% and 14% of the projects respectively. Developers use two or three DL libraries in the same project and tend to use multiple DL libraries in both the same function and the same file. Developers show patterns in using various deep learning libraries and prefer simple functions with fewer arguments and straightforward goals. Finally, we present the implications of our findings for researchers, library maintainers, and hardware vendors.
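
To make the notion of a "most frequent library combination" concrete, here is a minimal sketch of how pairwise co-occurrence across projects could be counted. It is an illustration under stated assumptions, not the study's pipeline: the function name `most_common_pairs` and the four-project sample are hypothetical.

```python
from collections import Counter
from itertools import combinations


def most_common_pairs(project_libraries, top=1):
    """Return the `top` most frequent library pairs and the share of projects using each."""
    counts = Counter()
    for libs in project_libraries:
        # Deduplicate and sort so each pair is counted once per project
        # and always appears in the same order.
        for pair in combinations(sorted(set(libs)), 2):
            counts[pair] += 1
    total = len(project_libraries)
    return [(pair, n / total) for pair, n in counts.most_common(top)]


# Hypothetical dependency lists, one per project (not data from the study).
projects = [
    ["pytorch", "scikit-learn"],
    ["pytorch", "scikit-learn", "numpy"],
    ["keras", "tensorflow"],
    ["tensorflow"],
]

top_pairs = most_common_pairs(projects)
print(top_pairs)
```

The share is computed over all projects, including those that use a single library, which matches the "in X% of the projects" phrasing of the abstract.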

Reliable Malware Analysis and Detection using Topology Data Analysis

Lionel Nganyewou Tidjon , Foutse Khomh

Category: Artificial Intelligence | Machine Learning

2022-11-03

Malware is becoming increasingly complex and is spreading across networks, targeting different infrastructures and personal-end devices to collect, modify, and destroy victim information. Malware behaviors are polymorphic, metamorphic, and persistent; they are able to hide to bypass detectors, adapt to new environments, and even leverage machine learning techniques to better damage targets. This makes malware difficult to analyze and detect with traditional endpoint detection and response or intrusion detection and prevention systems. To defend against malware, recent work has proposed different techniques based on signatures and machine learning. In this paper, we propose to use an algebraic topological approach called topological data analysis (TDA) to efficiently analyze and detect complex malware patterns. Next, we compare the different TDA techniques (i.e., persistent homology, ToMATo, TDA Mapper) and existing techniques (i.e., PCA, UMAP, t-SNE) using different classifiers, including random forest, decision tree, XGBoost, and LightGBM. We also propose some recommendations to deploy the best-identified models for malware detection at scale. Results show that TDA Mapper (combined with PCA) is better than PCA alone for clustering and for identifying hidden relationships between malware clusters. Persistence diagrams are better at identifying overlapping malware clusters, with low execution time, compared to UMAP and t-SNE. For malware detection, malware analysts can use Random Forest and Decision Tree with t-SNE and persistence diagrams to achieve better performance and robustness on noisy data.

Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems

Fitash Ul Haq , Donghwan Shin , Lionel Briand

Category: Machine Learning

2022-10-27

Deep Neural Networks (DNNs) have been widely used to perform real-world tasks in cyber-physical systems such as Autonomous Driving Systems (ADS). Ensuring the correct behavior of such DNN-Enabled Systems (DES) is a crucial topic. Online testing is one of the promising modes for testing such systems with their application environments (simulated or real) in a closed loop, taking into account the continuous interaction between the systems and their environments. However, the environmental variables (e.g., lighting conditions) that might change during the systems' operation in the real world, causing the DES to violate requirements (safety, functional), are often kept constant during the execution of an online test scenario due to two major challenges: (1) the space of all possible scenarios to explore would become even larger if they changed and (2) there are typically many requirements to test simultaneously. In this paper, we present MORLOT (Many-Objective Reinforcement Learning for Online Testing), a novel online testing approach to address these challenges by combining Reinforcement Learning (RL) and many-objective search. MORLOT leverages RL to incrementally generate sequences of environmental changes while relying on many-objective search to determine the changes so that they are more likely to achieve any of the uncovered objectives. We empirically evaluate MORLOT using CARLA, a high-fidelity simulator widely used for autonomous driving research, integrated with Transfuser, a DNN-enabled ADS for end-to-end driving. The evaluation results show that MORLOT is significantly more effective and efficient than alternatives, with a large effect size. In other words, MORLOT is a good option to test DES with dynamically changing environments while accounting for multiple safety requirements.

Baking in the Feature: Accelerating Volumetric Segmentation by Rendering Feature Maps

Kenneth Blomqvist , Lionel Ott , Jen Jen Chung , Roland Siegwart

Category: Computer Vision | Robotics

2022-09-26

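Methods have recently been proposed that densely segment 3D volumes into classes using only color images and expert supervision in the form of sparse semantically annotated pixels. While impressive, these methods still require a relatively large amount of supervision, and segmenting an object can take several minutes in practice. Such systems typically optimize their representation only on the particular scene they are fitting, without leveraging any prior information from previously seen images. In this paper, we propose to use features extracted with models trained on large existing datasets to improve segmentation performance. We bake this feature representation into a Neural Radiance Field (NeRF) by volumetrically rendering feature maps and supervising on features extracted from each input image. We show that by baking this representation into the NeRF, we make the subsequent classification task much easier. Our experiments show that our method achieves higher segmentation accuracy with fewer semantic annotations than existing methods over a wide range of scenes.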

Learning Agent-Aware Affordances for Closed-Loop Interaction with Articulated Objects

Giulio Schiavi , Paula Wulkop , Giuseppe Rizzi , Lionel Ott , Roland Siegwart , Jen Jen Chung

Category: Robotics

2022-09-13

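Interaction with articulated objects is a challenging but important task for mobile robots. To tackle this challenge, we propose a novel closed-loop control pipeline that integrates manipulation priors from affordance estimation with sampling-based whole-body control. We introduce the concept of agent-aware affordances, which fully reflect the agent's capabilities and embodiment, and we show that they outperform their state-of-the-art counterparts, which are conditioned only on the end-effector geometry. Additionally, closed-loop affordance inference is found to allow the agent to divide a task into multiple non-continuous motions and to recover from failure and unexpected states. Finally, the pipeline is able to perform long-horizon mobile manipulation tasks, i.e. opening and closing an oven, in the real world with high success rates (opening: 71%, closing: 72%).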

Fast and Accurate Importance Weighting for Correcting Sample Bias

Antoine de Mathelin , Francois Deheeger , Mathilde Mougeot , Nicolas Vayatis

Category: Machine Learning

2022-09-09

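Bias in datasets can be very detrimental to appropriate statistical estimation. In response to this problem, importance weighting methods have been developed to match any biased distribution to its corresponding unbiased target distribution. The seminal Kernel Mean Matching (KMM) method is still considered the state of the art in this research field. However, one of the main drawbacks of this method is its computational burden for large datasets. Building on previous work by Huang et al. (2007) and de Mathelin et al. (2021), we derive a novel importance weighting algorithm which scales to large datasets by using a neural network to predict the instance weights. We show, on multiple public datasets and under various sample biases, that our proposed approach drastically reduces the computational time on large datasets while maintaining sample bias correction performance similar to that of other importance weighting methods. The proposed approach appears to be the only one able to give relevant reweighting in a reasonable time for large datasets with up to two million data points.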
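
As background for the idea of importance weighting mentioned above, the following minimal sketch reweights a biased sample so that its weighted mean approximates the mean of the target distribution. It is not KMM and not the authors' neural-network method: it assumes, unrealistically, that both densities are known Gaussians, so the weights are simply the density ratio. All names (`gaussian_pdf`, `importance_weights`, `weighted_mean`) and the sample are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(samples, source_pdf, target_pdf):
    """Density-ratio weights target/source, rescaled to have mean 1."""
    raw = [target_pdf(x) / source_pdf(x) for x in samples]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


def weighted_mean(samples, weights):
    return sum(w * x for x, w in zip(samples, weights)) / sum(weights)


# Assumption: the biased sample comes from N(1, 1) while the target is N(0, 1).
rng = random.Random(0)
biased = [rng.gauss(1.0, 1.0) for _ in range(20000)]

weights = importance_weights(
    biased,
    source_pdf=lambda x: gaussian_pdf(x, 1.0, 1.0),
    target_pdf=lambda x: gaussian_pdf(x, 0.0, 1.0),
)

naive_estimate = sum(biased) / len(biased)            # close to the biased mean, 1
corrected_estimate = weighted_mean(biased, weights)   # close to the target mean, 0
print(naive_estimate, corrected_estimate)
```

In practice the densities are unknown, which is why methods such as KMM, or a network that predicts instance weights as described in the abstract, estimate the weights directly from the samples.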
